Recent advances in the development of large language models have brought public access to state-of-the-art pre-trained language models (PLMs), including Generative Pre-trained Transformer 3 (GPT-3) and Bidirectional Encoder Representations from Transformers (BERT). In practice, however, evaluations of PLMs have shown their susceptibility to adversarial attacks during both the training and fine-tuning stages of development. Such attacks can lead to erroneous outputs, model-generated hate speech, and the exposure of users' sensitive information. While existing research has focused on adversarial attacks during either the training or the fine-tuning of PLMs, there is a deficit of information on attacks made between these two development phases. In this work, we highlight a major security vulnerability in the public release of GPT-3 and further investigate this vulnerability in other state-of-the-art PLMs. We restrict our work to pre-trained models that have not been fine-tuned. Further, we emphasize token-distance-minimized perturbations as an effective adversarial approach, bypassing both supervised and unsupervised quality measures. Following this approach, we observe a significant decrease in text classification quality when evaluating for semantic similarity.
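To make the idea of a token-distance-minimized perturbation concrete, here is a minimal sketch (not the authors' procedure) that swaps a word for its nearest neighbour in an embedding space; the toy `embeddings` table, the distance threshold, and the greedy word-by-word substitution are all illustrative assumptions.

```python
import numpy as np

# Toy embedding table; in practice these would come from a PLM's input embeddings.
embeddings = {
    "good": np.array([0.9, 0.1]),
    "great": np.array([0.88, 0.15]),
    "bad": np.array([-0.9, 0.2]),
}

def nearest_substitute(token: str, max_dist: float = 0.1) -> str:
    """Return the closest alternative token within max_dist, else the original."""
    if token not in embeddings:
        return token
    src = embeddings[token]
    best, best_dist = token, float("inf")
    for cand, vec in embeddings.items():
        if cand == token:
            continue
        dist = float(np.linalg.norm(src - vec))
        if dist < best_dist:
            best, best_dist = cand, dist
    return best if best_dist <= max_dist else token

def perturb(sentence: str) -> str:
    """Greedily swap each token for its embedding-space nearest neighbour."""
    return " ".join(nearest_substitute(tok) for tok in sentence.split())

print(perturb("good movie"))  # "great movie" with the toy table above
```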
Mitotic activity is key for the assessment of malignancy in many tumors. Moreover, it has been demonstrated that the proportion of abnormal mitosis to normal mitosis is of prognostic significance. Atypical mitotic figures (MF) can be identified morphologically as having segregation abnormalities of the chromatids. In this work, we perform, for the first time, automatic subtyping of mitotic figures into normal and atypical categories according to characteristic morphological appearances of the different phases of mitosis. Using the publicly available MIDOG21 and TUPAC16 breast cancer mitosis datasets, two experts blindly subtyped mitotic figures into five morphological categories. Further, we set up a state-of-the-art object detection pipeline extending the anchor-free FCOS approach with a gated hierarchical subclassification branch. Our labeling experiment indicated that subtyping of mitotic figures is a challenging task and prone to inter-rater disagreement, which we found in 24.89% of MF. Using the more diverse MIDOG21 dataset for training and TUPAC16 for testing, we reached a mean overall average precision score of 0.552, a ROC AUC score of 0.833 for atypical/normal MF and a mean class-averaged ROC-AUC score of 0.977 for discriminating the different phases of cells undergoing mitosis.
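As a rough illustration of a gated hierarchical subclassification branch (the paper's exact head design is not reproduced here), the following PyTorch sketch gates subclass logits with a parent "mitotic figure present" score; channel sizes and the gating rule are assumptions.

```python
import torch
import torch.nn as nn

class GatedHierarchicalHead(nn.Module):
    """Toy hierarchical classification head: a parent 'mitotic figure' logit
    gates the subclass logits (e.g. atypical/normal or mitosis phases)."""

    def __init__(self, in_channels: int, num_subclasses: int):
        super().__init__()
        self.parent = nn.Conv2d(in_channels, 1, kernel_size=1)       # MF vs. background
        self.subclasses = nn.Conv2d(in_channels, num_subclasses, 1)  # per-subtype scores

    def forward(self, features: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        parent_logit = self.parent(features)
        gate = torch.sigmoid(parent_logit)             # confidence that a MF is present
        sub_logits = self.subclasses(features) * gate  # subclass scores suppressed when gate ~ 0
        return parent_logit, sub_logits

head = GatedHierarchicalHead(in_channels=256, num_subclasses=5)
parent, sub = head(torch.randn(1, 256, 32, 32))
print(parent.shape, sub.shape)  # torch.Size([1, 1, 32, 32]) torch.Size([1, 5, 32, 32])
```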
Colonoscopy is widely recognized as the gold-standard procedure for the early detection of colorectal cancer (CRC). Segmentation is valuable for two significant clinical applications, namely lesion detection and classification, providing a means to improve accuracy and robustness. Manual segmentation of polyps in colonoscopy images is time-consuming. As a result, the automation of polyp segmentation with deep learning (DL) has become important. However, DL-based solutions can be vulnerable to overfitting and consequently fail to generalize to images captured by different colonoscopes. Recent transformer-based architectures for semantic segmentation achieve higher performance than alternatives and generalize better, but typically predict segmentation maps of $\frac{H}{4} \times \frac{W}{4}$ spatial dimensions for an $H \times W$ input image. To this end, we propose a new architecture for full-size segmentation that leverages the strength of a transformer in extracting the most important features in its primary branch, while compensating for its limitations with a secondary fully convolutional branch. The final features from both branches are then fused to predict an $H \times W$ segmentation map. We demonstrate our method's state-of-the-art performance with respect to the mDice, mIoU, mPrecision, and mRecall metrics on both the Kvasir-SEG and CVC-ClinicDB dataset benchmarks. Additionally, we train the model on each dataset and evaluate on the other to demonstrate its superior generalization performance.
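A minimal sketch of the dual-branch idea, using placeholder modules: a coarse branch standing in for the transformer (operating at 1/4 resolution) is upsampled and fused with a full-resolution convolutional branch to yield an H x W prediction. This is not the proposed architecture, only its fusion pattern.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualBranchSegmenter(nn.Module):
    """Toy dual-branch segmenter: coarse H/4 x W/4 features are upsampled and
    fused with a full-resolution convolutional branch to predict an H x W map."""

    def __init__(self, in_ch: int = 3, feat_ch: int = 32, num_classes: int = 1):
        super().__init__()
        # Stand-in for a transformer encoder that works at 1/4 resolution.
        self.coarse_branch = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, stride=4, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
        )
        # Full-resolution fully convolutional branch.
        self.full_branch = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(),
        )
        self.fuse = nn.Conv2d(2 * feat_ch, num_classes, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        coarse = F.interpolate(self.coarse_branch(x), size=(h, w),
                               mode="bilinear", align_corners=False)
        full = self.full_branch(x)
        return self.fuse(torch.cat([coarse, full], dim=1))  # full-size H x W map

model = DualBranchSegmenter()
print(model(torch.randn(1, 3, 256, 256)).shape)  # torch.Size([1, 1, 256, 256])
```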
We implemented two distinct 3D deep learning neural networks and evaluated their ability to segment intracranial hemorrhage (ICH) seen on non-contrast computed tomography (CT). One model, referred to as "Voxels-Intersecting along Orthogonal Levels of Attention U-Net" (Viola-Unet), has architecture elements adaptable to the 2022 INSTANCE Data Challenge. The second, comparison model was derived from the no-new U-Net (nnU-Net). Input images and ground-truth segmentation maps were used to train the two networks separately in a supervised manner; the validation data were subsequently used for semi-supervised training. Model predictions were compared during 5-fold cross-validation. Viola-Unet outperformed the comparison network on two of the four performance metrics (i.e., NSD and RVD). An ensemble model combining the Viola-Unet and nnU-Net networks achieved the highest performance in terms of DSC and HD. We demonstrate that ICH segmentation performance benefits from a 3D U-Net that effectively incorporates spatially orthogonal features during the decoding branches of the U-Net. The code base, pre-trained weights, and Docker image of the Viola-Unet AI tool will be publicly available at https://github.com/samleoqh/viola-unet
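A toy sketch of the ensembling step, under the assumption of simple voxel-wise probability averaging (the actual submission may combine the two networks' predictions differently).

```python
import numpy as np

def ensemble_segmentation(prob_map_a: np.ndarray, prob_map_b: np.ndarray,
                          threshold: float = 0.5) -> np.ndarray:
    """Average two models' voxel-wise foreground probabilities and threshold.
    A stand-in for combining, e.g., Viola-Unet and nnU-Net predictions."""
    return (0.5 * (prob_map_a + prob_map_b)) >= threshold

a = np.random.rand(4, 64, 64, 64)   # hypothetical CT-volume probabilities, model A
b = np.random.rand(4, 64, 64, 64)   # model B
mask = ensemble_segmentation(a, b)
print(mask.shape, mask.dtype)       # (4, 64, 64, 64) bool
```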
Bound propagation methods, when combined with branch and bound, are among the most effective methods for formally verifying properties of deep neural networks such as correctness, robustness, and safety. However, existing works cannot handle the general form of cutting-plane constraints widely accepted in traditional solvers, which are crucial for strengthening verifiers with tightened convex relaxations. In this paper, we generalize the bound propagation procedure to allow the addition of arbitrary cutting-plane constraints, including those involving relaxed integer variables that do not appear in existing bound propagation formulations. Our generalized bound propagation method, GCP-CROWN, opens up an opportunity to apply general cutting-plane methods to neural network verification while benefiting from the efficiency and GPU acceleration of bound propagation methods. As a case study, we investigate the use of cutting planes generated by off-the-shelf mixed integer programming (MIP) solvers. We find that MIP solvers can generate high-quality cutting planes for strengthening bound-propagation-based verifiers using our new formulation. Since the branching-focused bound propagation procedure and the cutting-plane-focused MIP solver can run in parallel on different types of hardware (GPUs and CPUs), their combination can quickly explore a large number of branches with strong cutting planes, leading to strong verification performance. Experiments demonstrate that our method is the first verifier that can completely solve the oval20 benchmark and verify twice as many instances on the oval21 benchmark as the best tool in VNN-COMP 2021, and it also significantly outperforms state-of-the-art verifiers on a wide range of benchmarks. GCP-CROWN is part of the $\alpha,\beta$-CROWN verifier, the VNN-COMP 2022 winner. Code is available at http://papercode.cc/gcp-crown
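For orientation, the sketch below shows plain interval bound propagation through a single affine layer; it is only the baseline idea that GCP-CROWN generalizes, and it does not model cutting-plane constraints or the paper's formulation.

```python
import numpy as np

def interval_bounds(W: np.ndarray, b: np.ndarray,
                    lower: np.ndarray, upper: np.ndarray):
    """Propagate element-wise input bounds through an affine layer y = Wx + b.
    A minimal example of the bound propagation that verifiers build on;
    GCP-CROWN-style methods tighten such bounds with extra (cutting-plane)
    constraints, which this toy function does not model."""
    W_pos, W_neg = np.clip(W, 0, None), np.clip(W, None, 0)
    y_lower = W_pos @ lower + W_neg @ upper + b
    y_upper = W_pos @ upper + W_neg @ lower + b
    return y_lower, y_upper

W = np.array([[1.0, -2.0], [0.5, 0.5]])
b = np.zeros(2)
lo, hi = interval_bounds(W, b, lower=np.array([-1.0, -1.0]), upper=np.array([1.0, 1.0]))
print(lo, hi)  # [-3. -1.] [3. 1.]
```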
Topological data analysis (TDA) is a tool from data science and mathematics that is beginning to make waves in the environmental sciences. In this work, we seek to give an intuitive and understandable introduction to the TDA tool that is particularly useful for analyzing imagery, namely persistent homology. We briefly discuss the theoretical background, but focus primarily on understanding the output of this tool and discussing what information it can glean. To this end, we frame our discussion around a guiding example: classifying the sugar, fish, flower, and gravel dataset studied by Rasp et al. 2020 (arXiv:1906.01906). We demonstrate how good results can be obtained with a simple machine learning algorithm, and explore in detail how this behavior can be explained in terms of image-level features. One of the core strengths of persistent homology is its interpretability, so throughout this paper we discuss not only the patterns we find but also why, given what we know about the theory of persistent homology, those results are to be expected. Our goal is that readers of this paper will come away with a better understanding of TDA and persistent homology, will be able to identify problems and datasets of their own for which persistent homology could be helpful, and will gain an understanding of the results obtained by applying the included GitHub example code.
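A minimal sketch of computing persistent homology of an image via a sublevel-set cubical filtration, assuming the gudhi package's CubicalComplex API; the toy ring image and the sign convention are illustrative and are not taken from the paper's GitHub code.

```python
import numpy as np
import gudhi  # assumed available; pip install gudhi

# Toy grayscale image: a bright ring, which creates a 1-dimensional hole (loop).
img = np.zeros((32, 32))
yy, xx = np.mgrid[:32, :32]
ring = np.abs(np.hypot(yy - 16, xx - 16) - 10) < 2
img[ring] = 1.0

# Sublevel-set filtration of the image via a cubical complex.
cubical = gudhi.CubicalComplex(top_dimensional_cells=-img)  # negate so bright pixels enter first
diagram = cubical.persistence()

# Each entry is (homology dimension, (birth, death)); long-lived features are
# the "shapes" (connected components, loops) a downstream classifier could use.
for dim, (birth, death) in diagram:
    print(f"H{dim}: birth={birth:.2f}, death={death:.2f}")
```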
Oral epithelial dysplasia (OED) is a premalignant histopathological diagnosis given to lesions of the oral cavity. Predicting the grade of OED, or whether a case will transition to malignancy, is critical for early detection and appropriate treatment. OED typically begins in the lower third of the epithelium before progressing upwards with increasing grade severity; we therefore propose that segmenting the epithelial layers, in addition to individual nuclei, may enable researchers to evaluate important layer-specific morphological features for grade/malignancy prediction. We present HoVer-Net+, a deep learning framework to simultaneously segment (and classify) nuclei and (intra-)epithelial layers in H&E-stained slides. The proposed architecture consists of an encoder branch and four decoder branches for the simultaneous instance segmentation of nuclei and semantic segmentation of the epithelial layers. We show that the proposed model achieves state-of-the-art (SOTA) performance on both tasks, at no additional cost compared to the previous SOTA method for each task. To the best of our knowledge, ours is the first method for simultaneous nucleus instance segmentation and semantic tissue segmentation, with potential for use in computational pathology for other similar simultaneous tasks and in research on malignancy prediction.
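A schematic sketch of the shared-encoder, multi-decoder layout described above; the branch names, channel counts, and single-convolution "decoders" are placeholders, not the HoVer-Net+ implementation.

```python
import torch
import torch.nn as nn

class MultiBranchSegNet(nn.Module):
    """Toy encoder with several decoder heads, echoing the HoVer-Net+ idea of one
    shared encoder feeding separate nuclear and epithelial-layer branches."""

    def __init__(self, in_ch: int = 3, feat: int = 16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU())
        heads = {
            "nuclei_fg": 2,     # nucleus vs. background
            "nuclei_type": 4,   # nucleus classification
            "hover_maps": 2,    # horizontal/vertical distance maps
            "layers": 5,        # (intra-)epithelial layer classes
        }
        self.decoders = nn.ModuleDict(
            {name: nn.Conv2d(feat, ch, 1) for name, ch in heads.items()}
        )

    def forward(self, x: torch.Tensor) -> dict[str, torch.Tensor]:
        z = self.encoder(x)
        return {name: head(z) for name, head in self.decoders.items()}

net = MultiBranchSegNet()
outs = net(torch.randn(1, 3, 128, 128))
print({k: tuple(v.shape) for k, v in outs.items()})
```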
We demonstrate a proof-of-concept of a large language model conducting corporate lobbying related activities. We use an autoregressive large language model (OpenAI's text-davinci-003) to determine if proposed U.S. Congressional bills are relevant to specific public companies and provide explanations and confidence levels. For the bills the model deems as relevant, the model drafts a letter to the sponsor of the bill in an attempt to persuade the congressperson to make changes to the proposed legislation. We use hundreds of ground-truth labels of the relevance of a bill to a company to benchmark the performance of the model, which outperforms the baseline of predicting the most common outcome of irrelevance. However, we test the ability to determine the relevance of a bill with the previous OpenAI GPT-3 model (text-davinci-002), which was state-of-the-art on many language tasks until text-davinci-003 was released on November 28, 2022. The performance of text-davinci-002 is worse than simply always predicting that a bill is irrelevant to a company. These results suggest that, as large language models continue to improve core natural language understanding capabilities, performance on corporate lobbying related tasks will continue to improve. We then discuss why this could be problematic for societal-AI alignment.
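A sketch of how such a relevance query could be issued with the legacy (pre-1.0) openai Python client that was current when text-davinci-003 was available; the prompt wording and helper function are our assumptions, and the model has since been deprecated, so this is illustrative rather than a reproduction of the authors' pipeline.

```python
import os
import openai  # legacy (<1.0) client, contemporary with text-davinci-003

openai.api_key = os.environ["OPENAI_API_KEY"]

def bill_relevance(bill_summary: str, company_description: str) -> str:
    """Ask the model whether a bill is relevant to a company, with a brief
    explanation and a confidence level (prompt wording is illustrative)."""
    prompt = (
        "You assess corporate lobbying relevance.\n"
        f"Company: {company_description}\n"
        f"Bill summary: {bill_summary}\n"
        "Is this bill relevant to the company? Answer YES or NO, then give a "
        "one-sentence explanation and a confidence from 0 to 100."
    )
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=150,
        temperature=0,
    )
    return response["choices"][0]["text"].strip()

print(bill_relevance("A bill tightening data-privacy rules for ad networks.",
                     "An online advertising technology firm."))
```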
In the past years, deep learning has seen an increase of usage in the domain of histopathological applications. However, while these approaches have shown great potential, in high-risk environments deep learning models need to be able to judge their own uncertainty and be able to reject inputs when there is a significant chance of misclassification. In this work, we conduct a rigorous evaluation of the most commonly used uncertainty and robustness methods for the classification of Whole-Slide-Images under domain shift using the H\&E stained Camelyon17 breast cancer dataset. Although it is known that histopathological data can be subject to strong domain shift and label noise, to our knowledge this is the first work that compares the most common methods for uncertainty estimation under these aspects. In our experiments, we compare Stochastic Variational Inference, Monte-Carlo Dropout, Deep Ensembles, Test-Time Data Augmentation as well as combinations thereof. We observe that ensembles of methods generally lead to higher accuracies and better calibration and that Test-Time Data Augmentation can be a promising alternative when choosing an appropriate set of augmentations. Across methods, a rejection of the most uncertain tiles leads to a significant increase in classification accuracy on both in-distribution as well as out-of-distribution data. Furthermore, we conduct experiments comparing these methods under varying conditions of label noise. We observe that the border regions of the Camelyon17 dataset are subject to label noise and evaluate the robustness of the included methods against different noise levels. Lastly, we publish our code framework to facilitate further research on uncertainty estimation on histopathological data.
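A minimal sketch of one of the compared ingredients, Monte-Carlo dropout with entropy-based rejection of the most uncertain tiles; the toy classifier, sample count, and rejection quantile are arbitrary choices, not the paper's configuration.

```python
import torch
import torch.nn as nn

def mc_dropout_predict(model: nn.Module, x: torch.Tensor, n_samples: int = 20):
    """Monte-Carlo dropout: keep dropout active at test time, average the
    softmax outputs, and use predictive entropy as an uncertainty score."""
    model.train()  # enables dropout layers (toy setting; batch-norm caveats ignored)
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(n_samples)])
    mean_probs = probs.mean(dim=0)
    entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
    return mean_probs, entropy

# Toy tile classifier; rejecting the most uncertain tiles tends to raise
# accuracy on the remaining tiles, as reported in the study above.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64), nn.ReLU(),
                      nn.Dropout(0.5), nn.Linear(64, 2))
tiles = torch.randn(8, 3, 32, 32)
probs, unc = mc_dropout_predict(model, tiles)
keep = unc < unc.quantile(0.75)   # reject the 25% most uncertain tiles
print(probs[keep].argmax(dim=-1))
```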
In large-scale machine learning, recent works have studied the effects of compressing gradients in stochastic optimization in order to alleviate the communication bottleneck. These works have collectively revealed that stochastic gradient descent (SGD) is robust to structured perturbations such as quantization, sparsification, and delays. Perhaps surprisingly, despite the surge of interest in large-scale, multi-agent reinforcement learning, almost nothing is known about the analogous question: Are common reinforcement learning (RL) algorithms also robust to similar perturbations? In this paper, we investigate this question by studying a variant of the classical temporal difference (TD) learning algorithm with a perturbed update direction, where a general compression operator is used to model the perturbation. Our main technical contribution is to show that compressed TD algorithms, coupled with an error-feedback mechanism used widely in optimization, exhibit the same non-asymptotic theoretical guarantees as their SGD counterparts. We then extend our results significantly to nonlinear stochastic approximation algorithms and multi-agent settings. In particular, we prove that for multi-agent TD learning, one can achieve linear convergence speedups in the number of agents while communicating just $\tilde{O}(1)$ bits per agent at each time step. Our work is the first to provide finite-time results in RL that account for general compression operators and error-feedback in tandem with linear function approximation and Markovian sampling. Our analysis hinges on studying the drift of a novel Lyapunov function that captures the dynamics of a memory variable introduced by error feedback.
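A toy sketch of TD(0) with linear function approximation, a top-k compression operator, and an error-feedback memory variable, in the spirit of the scheme analyzed above (single-agent with synthetic transitions rather than the paper's Markovian, multi-agent setting).

```python
import numpy as np

def top_k(v: np.ndarray, k: int) -> np.ndarray:
    """Keep the k largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def compressed_td0(transitions, dim, alpha=0.05, gamma=0.9, k=2):
    """TD(0) with linear function approximation where the update direction is
    top-k compressed and the compression error is fed back (EF memory `e`).
    A toy sketch of the scheme, not the paper's exact algorithm."""
    theta = np.zeros(dim)
    e = np.zeros(dim)  # error-feedback memory variable
    for phi, r, phi_next in transitions:                  # features, reward, next features
        delta = r + gamma * phi_next @ theta - phi @ theta  # TD error
        g = delta * phi                                   # uncompressed update direction
        compressed = top_k(g + e, k)                      # compress direction plus residual
        e = (g + e) - compressed                          # store what the compressor dropped
        theta += alpha * compressed
    return theta

rng = np.random.default_rng(0)
transitions = [(rng.standard_normal(5), rng.standard_normal(), rng.standard_normal(5))
               for _ in range(200)]
print(compressed_td0(transitions, dim=5))
```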